AITopics | character hallucination

Collaborating Authors

character hallucination

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

RoleBreak: Character Hallucination as a Jailbreak Attack in Role-Playing Systems

Tang, Yihong, Wang, Bo, Wang, Xu, Zhao, Dongming, Liu, Jing, Zhang, Jijun, He, Ruifang, Hou, Yuexian

arXiv.org Artificial IntelligenceSep-25-2024

Role-playing systems powered by large language models (LLMs) have become increasingly influential in emotional communication applications. However, these systems are susceptible to character hallucinations, where the model deviates from predefined character roles and generates responses that are inconsistent with the intended persona. This paper presents the first systematic analysis of character hallucination from an attack perspective, introducing the RoleBreak framework. Our framework identifies two core mechanisms-query sparsity and role-query conflict-as key factors driving character hallucination. Leveraging these insights, we construct a novel dataset, RoleBreakEval, to evaluate existing hallucination mitigation techniques. Our experiments reveal that even enhanced models trained to minimize hallucination remain vulnerable to attacks. To address these vulnerabilities, we propose a novel defence strategy, the Narrator Mode, which generates supplemental context through narration to mitigate role-query conflicts and improve query generalization. Experimental results demonstrate that Narrator Mode significantly outperforms traditional refusal-based strategies by reducing hallucinations, enhancing fidelity to character roles and queries, and improving overall narrative coherence.

character hallucination, hallucination, query, (15 more...)

arXiv.org Artificial Intelligence

2409.16727

Country:

Asia > China > Tianjin Province > Tianjin (0.05)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)
Asia > Singapore (0.04)
Africa (0.04)

Genre:

Personal > Interview (0.93)
Research Report > New Finding (0.87)

Industry:

Health & Medicine (0.68)
Media (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models

Ahn, Jaewoo, Lee, Taehyun, Lim, Junyoung, Kim, Jin-Hwa, Yun, Sangdoo, Lee, Hwaran, Kim, Gunhee

arXiv.org Artificial IntelligenceMay-28-2024

While Large Language Models (LLMs) can serve as agents to simulate human behaviors (i.e., role-playing agents), we emphasize the importance of point-in-time role-playing. This situates characters at specific moments in the narrative progression for three main reasons: (i) enhancing users' narrative immersion, (ii) avoiding spoilers, and (iii) fostering engagement in fandom role-playing. To accurately represent characters at specific time points, agents must avoid character hallucination, where they display knowledge that contradicts their characters' identities and historical timelines. We introduce TimeChara, a new benchmark designed to evaluate point-in-time character hallucination in role-playing LLMs. Comprising 10,895 instances generated through an automated pipeline, this benchmark reveals significant hallucination issues in current state-of-the-art LLMs (e.g., GPT-4o). To counter this challenge, we propose Narrative-Experts, a method that decomposes the reasoning steps and utilizes narrative experts to reduce point-in-time character hallucinations effectively. Still, our findings with TimeChara highlight the ongoing challenges of point-in-time character hallucination, calling for further study.

event summary, hallucination, time point, (16 more...)